Sequence Discrimination Using Phase-Type Distributions

Authors

  • Jérôme Callut
  • Pierre Dupont
Abstract

This paper proposes a novel approach to the classification of discrete sequences. The approach builds a model that fits dynamical features deduced from the learning sample. These features are discrete phase-type (PH) distributions, which model the first passage times (FPT) between occurrences of pairs of substrings. The PHit algorithm, an adapted version of the Expectation-Maximization algorithm, is proposed to estimate PH distributions. The most informative pairs of substrings are selected according to the Jensen-Shannon divergence between their class-conditional empirical FPT distributions. The selected features are then used in two classification schemes: a maximum a posteriori (MAP) classifier and support vector machines (SVM) with marginalized kernels. Experiments on DNA splicing region detection and on protein sublocalization show that the proposed techniques are competitive with smoothed Markov chains or SVM with a spectrum string kernel.
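
The feature-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the first passage time is assumed here to be the distance from the start of an occurrence of the first substring to the start of the next occurrence of the second substring at a strictly later position. The paper may define it differently. The sketch collects empirical FPT distributions and scores a substring pair by the Jensen-Shannon divergence (base 2) between two such distributions.

```python
from collections import Counter
from math import log2


def first_passage_times(seq, a, b):
    """Distances from each occurrence of `a` to the next occurrence of `b`.

    Assumption: positions are substring start indices and `b` must start
    strictly after `a` starts. Occurrences of `a` with no later `b` are skipped.
    """
    starts_b = [j for j in range(len(seq) - len(b) + 1) if seq.startswith(b, j)]
    times = []
    for i in range(len(seq) - len(a) + 1):
        if seq.startswith(a, i):
            nxt = next((j for j in starts_b if j > i), None)
            if nxt is not None:
                times.append(nxt - i)
    return times


def empirical_distribution(times):
    """Normalize observed passage times into a probability mass function."""
    counts = Counter(times)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits between two pmfs given as dicts."""
    support = set(p) | set(q)
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in support}

    def kl_to_mixture(x):
        # KL divergence from x to the mixture m; zero-mass terms contribute 0.
        return sum(x[t] * log2(x[t] / m[t]) for t in x if x[t] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def pair_score(class0_seqs, class1_seqs, a, b):
    """Score the pair (a, b) by the divergence of class-conditional FPT pmfs."""
    t0 = [t for s in class0_seqs for t in first_passage_times(s, a, b)]
    t1 = [t for s in class1_seqs for t in first_passage_times(s, a, b)]
    return jensen_shannon(empirical_distribution(t0), empirical_distribution(t1))
```

In this sketch, pairs with a higher `pair_score` would be kept as more informative. Fitting PH distributions to the selected FPT samples (the role of PHit in the abstract) is not covered here.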

Similar resources

Classification and properties of acyclic discrete phase-type distributions based on geometric and shifted geometric distributions

Acyclic phase-type distributions form a versatile model, serving as approximations to many probability distributions in various circumstances. They exhibit special properties and characteristics that usually make their applications attractive. Compared to acyclic continuous phase-type (ACPH) distributions, acyclic discrete phase-type (ADPH) distributions and their subclasses (ADPH family) have ...

Using Coxian phase-type distributions to identify patient characteristics for duration of stay in hospital.

Coxian phase-type distributions are a special type of Markov model that describes duration until an event occurs in terms of a process consisting of a sequence of latent phases. This paper considers the use of Coxian phase-type distributions for modelling patient duration of stay for the elderly in hospital and investigates the potential for using the resulting distribution as a classifying var...

Discrimination performance of single neurons: rate and temporal-pattern information.

1. A new method of measuring the performance of neurons in sensory discrimination tasks was developed and then applied to single-neuron responses recorded in the auditory nerve of chinchilla and in the striate visual cortex of cat. 2. Most previous methods of measuring discrimination performance have employed decision rules that involve comparing the total counts of action potentials (spikes) p...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Detection and Discrimination of Theileria annulata and Theileria lestoquardi by using a Single PCR

The aim of this study was to detect and differentiate Theileria annulata and T. lestoquardi (hirci) by PCR. Members of the genus Theileria are tick-borne hemoprotozoan parasites that cause fatal and enervating diseases of cattle and sheep in Iran. In order to develop a specific method for the detection and identification of Theileria species, specific primers from the surface protein (SP) seque...

Publication date: 2006